16 research outputs found

    Models and Matheuristics for Large-Scale Combinatorial Optimization Problems

    Combinatorial optimization deals with efficiently determining an optimal (or at least a good) decision among a finite set of alternatives. In business administration, such combinatorial optimization problems arise in, e.g., portfolio selection, project management, data analysis, and logistics. These optimization problems have in common that the set of alternatives becomes very large as the problem size increases, and therefore an exhaustive search of all alternatives may require a prohibitively long computation time. Moreover, due to their combinatorial nature, no closed-form solutions to these problems exist. In practice, a common approach to tackle combinatorial optimization problems is to formulate them as mathematical models and to solve them using a mathematical programming solver (cf., e.g., Bixby et al. 1999, Achterberg et al. 2020). For small-scale problem instances, the mathematical models comprise a manageable number of variables and constraints such that mathematical programming solvers are able to devise optimal solutions within a reasonable computation time. For large-scale problem instances, the number of variables and constraints becomes very large, which extends the computation time required to find an optimal solution considerably. Therefore, despite the continuously improving performance of mathematical programming solvers and computing hardware, the availability of mathematical models that are efficient in terms of the number of variables and constraints used is of crucial importance. Another frequently used approach to address combinatorial optimization problems is the use of matheuristics. Matheuristics decompose the considered optimization problem into subproblems, which are then formulated as mathematical models and solved with the help of a mathematical programming solver.
Matheuristics are particularly suitable for situations where it is required to find a good, but not necessarily an optimal solution within a short computation time, since the speed of the solution process can be controlled by choosing an appropriate size of the subproblems. This thesis consists of three papers on large-scale combinatorial optimization problems. We consider a portfolio optimization problem in finance, a scheduling problem in project management, and a clustering problem in data analysis. For these problems, we present novel mathematical models that require a relatively small number of variables and constraints, and we develop matheuristics that are based on novel problem-decomposition strategies. In extensive computational experiments, the proposed models and matheuristics performed favorably compared to state-of-the-art models and solution approaches from the literature. In the first paper, we consider the problem of determining a portfolio for an enhanced index-tracking fund. Enhanced index-tracking funds aim to replicate the returns of a particular financial stock-market index as closely as possible while outperforming that index by a small positive excess return. Additionally, we consider various real-life constraints that may be imposed by investors, stock exchanges, or investment guidelines. Since enhanced index-tracking funds are particularly attractive to investors if the index comprises a large number of stocks and thus is well diversified, it is of particular interest to tackle large-scale problem instances. For this problem, we present two matheuristics that consist of a novel construction matheuristic and two different improvement matheuristics, which are based on the concepts of local branching (cf. Fischetti and Lodi 2003) and iterated greedy heuristics (cf., e.g., Ruiz and Stützle 2007), respectively.
Moreover, both matheuristics are based on a novel mathematical model for which we provide insights that allow us to remove numerous redundant variables and constraints. We tested both matheuristics in a computational experiment on problem instances that are based on large stock-market indices with up to 9,427 constituents. It turns out that our matheuristics yield better portfolios than benchmark approaches in terms of out-of-sample risk-return characteristics. In the second paper, we consider the problem of scheduling a set of precedence-related project activities, each of which requires some time and scarce resources during its execution. For each activity, alternative execution modes are given, which differ in the duration and the resource requirements of the activity. We seek a start time and an execution mode for each activity, such that all precedence relationships are respected, the required amount of each resource does not exceed its prescribed capacity, and the project makespan is minimized. For this problem, we present two novel mathematical models, in which the number of variables remains constant when the range of the activities' durations and thus also the planning horizon is increased. Moreover, we enhance the performance of the proposed mathematical models by eliminating some symmetric solutions from the search space and by adding some redundant sequencing constraints for activities that cannot be processed in parallel. In a computational experiment based on instances consisting of activities with durations ranging from one up to 260 time units, the proposed models consistently outperformed all reference models from the literature. In the third paper, we consider the problem of grouping similar objects into clusters, where the similarity between a pair of objects is determined by a distance measure based on some features of the objects.
In addition, we consider constraints that impose a maximum capacity for the clusters, since the size of the clusters is often restricted in practical clustering applications. Furthermore, practical clustering applications are often characterized by a very large number of objects to be clustered. For this reason, we present a matheuristic based on novel problem-decomposition strategies that are specifically designed for large-scale problem instances. The proposed matheuristic comprises two phases. In the first phase, we decompose the considered problem into a series of generalized assignment problems, and in the second phase, we decompose the problem into subproblems that comprise groups of clusters only. In a computational experiment, we tested the proposed matheuristic on problem instances with up to 498,378 objects. The proposed matheuristic consistently outperformed the state-of-the-art approach on medium- and large-scale instances, while matching the performance for small-scale instances. Although we considered three specific optimization problems in this thesis, the proposed models and matheuristics can be adapted to related optimization problems with only minor modifications. Examples of such related optimization problems are the UCITS-constrained index-tracking problem (cf., e.g., Strub and Trautmann 2019), which consists of determining the portfolio of an investment fund that must comply with regulatory restrictions imposed by the European Union, the multi-site resource-constrained project scheduling problem (cf., e.g., Laurent et al. 2017), which comprises the scheduling of a set of project activities that can be executed at alternative sites, or constrained clustering problems with must-link and cannot-link constraints (cf., e.g., González-Almagro et al. 2020).

    Tracking and outperforming large stock-market indices

    Enhanced index-tracking funds aim to achieve a small target excess return over a given financial benchmark index with minimum additional risk relative to this index, i.e., a minimum tracking error. These funds are attractive to investors, especially when the index is large and thus well diversified. We consider the problem of determining a portfolio for an enhanced index-tracking fund that is benchmarked against a large stock-market index subject to real-life constraints that may be imposed by investors, stock exchanges, or investment guidelines. In the literature, various solution approaches to enhanced index tracking have been proposed that are based on different linear and quadratic tracking-error functions. However, it remains an open question which tracking-error function should be minimized to determine good enhanced index-tracking portfolios. Moreover, the existing approaches may neglect real-life constraints such as the minimum trading values imposed by stock exchanges or may not devise good feasible portfolios within a reasonable computational time when the index is large. To overcome these shortcomings, we propose novel mixed-integer linear and quadratic programming formulations and novel matheuristics. To address the open question, we minimize different tracking-error functions by applying the proposed matheuristics and exact solution approaches based on the proposed mixed-integer programming formulations in a computational experiment using a set of problem instances based on large stock-market indices with more than 9,000 constituents. The results of our study suggest that minimizing the so-called tracking error variance, which is a quadratic function, is preferable to minimizing other tracking-error functions.
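To make the distinction between quadratic and linear tracking-error functions concrete, the following minimal sketch computes two common textbook measures from historical returns. The abstract does not spell out the exact functions compared in the paper, so the definitions below (sample variance of the return differences, and their mean absolute value) as well as the function names are assumptions for illustration only.

```python
from statistics import variance


def return_differences(portfolio_returns, index_returns):
    """Per-period difference between portfolio and index returns."""
    if len(portfolio_returns) != len(index_returns):
        raise ValueError("return series must have equal length")
    return [rp - ri for rp, ri in zip(portfolio_returns, index_returns)]


def tracking_error_variance(portfolio_returns, index_returns):
    """Quadratic measure: sample variance of the return differences.

    Assumed definition; requires at least two periods.
    """
    return variance(return_differences(portfolio_returns, index_returns))


def mean_absolute_difference(portfolio_returns, index_returns):
    """Linear measure: mean absolute return difference (assumed definition)."""
    diffs = return_differences(portfolio_returns, index_returns)
    return sum(abs(d) for d in diffs) / len(diffs)
```

In an optimization model, the portfolio returns would depend on the decision variables (the portfolio weights); here they are plain numbers so that the two measures can be compared on a fixed portfolio.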

    A matheuristic for large-scale capacitated clustering

    Clustering addresses the problem of assigning similar objects to groups. Since the size of the clusters is often constrained in practical clustering applications, various capacitated clustering problems have received increasing attention. We consider here the capacitated p-median problem (CPMP) in which p objects are selected as cluster centers (medians) such that the total distance from these medians to their assigned objects is minimized. Each object is associated with a weight, and the total weight in each cluster must not exceed a given capacity. Numerous exact and heuristic solution approaches have been proposed for the CPMP. The state-of-the-art approach performs well for instances with up to 5,000 objects but becomes computationally expensive for instances with a much larger number of objects. We propose a matheuristic with new problem decomposition strategies that can deal with instances comprising up to 500,000 objects. In a computational experiment, the proposed matheuristic consistently outperformed the state-of-the-art approach on medium- and large-scale instances while having similar performance for small-scale instances. As an extension, we show that our matheuristic can be applied to related capacitated clustering problems, such as the capacitated centered clustering problem (CCCP). For several test instances of the CCCP, our matheuristic found new best-known solutions.
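The CPMP definition above can be illustrated with a small checker that evaluates a candidate solution. This is only a sketch of the problem statement, not of the proposed matheuristic; the function name `evaluate_cpmp`, the Euclidean distance, and the encoding of a solution as a list of median indices are assumptions made for this example.

```python
from math import dist


def evaluate_cpmp(points, weights, assignment, capacity, p):
    """Return (total_distance, feasible) for a candidate CPMP solution.

    points[i]     : coordinates of object i
    weights[i]    : weight of object i
    assignment[i] : index of the median (cluster center) serving object i
    capacity      : maximum total weight allowed in each cluster
    p             : required number of medians
    """
    medians = set(assignment)
    # Objective: total distance from each object to its assigned median.
    total = sum(dist(points[i], points[m]) for i, m in enumerate(assignment))
    # Accumulate the weight assigned to each median.
    load = {m: 0.0 for m in medians}
    for i, m in enumerate(assignment):
        load[m] += weights[i]
    feasible = (
        len(medians) == p
        and all(assignment[m] == m for m in medians)  # a median serves itself
        and all(w <= capacity for w in load.values())
    )
    return total, feasible
```

For example, with four objects at (0, 0), (0, 1), (10, 0), and (10, 2), unit weights, and a capacity of 2, assigning the first two objects to object 0 and the last two to object 2 is feasible for p = 2.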

    An implementation of the parallel schedule-generation scheme for applying Microsoft Excel's Evolutionary Solver to the resource-constrained project scheduling problem RCPSP

    Since the 2010 version, the Solver Add-in of Microsoft Excel has included the so-called Evolutionary Solver. The application of this Solver to a combinatorial optimization problem requires a spreadsheet which determines the objective function value corresponding to given values for the decision variables. This paper addresses the resource-constrained project-scheduling problem; we study how to implement the parallel schedule-generation scheme on a spreadsheet. We compare the performance against the serial schedule-generation scheme based on the j30 PSPLIB test set. It turns out that the CPU time required for scheduling an activity is considerably lower in the parallel than in the serial schedule-generation scheme; as a consequence, more schedules can be analyzed within a prescribed amount of time. For the novel implementation of the parallel scheme, the average deviation from the minimum makespan is considerably smaller than for the serial scheme, and the number of instances solved to optimality is surprisingly high.
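For readers unfamiliar with the parallel schedule-generation scheme, the following sketch shows the general idea in Python rather than on a spreadsheet: time advances from one decision point to the next, and at each decision point the eligible activities are started in priority order as long as resources permit. The data layout, the function name `parallel_sgs`, and the assumption of strictly positive durations are choices made for this illustration and are not taken from the paper.

```python
def parallel_sgs(durations, predecessors, demands, capacities, priority):
    """Parallel schedule-generation scheme for the RCPSP (illustrative sketch).

    durations[j]    : duration of activity j (assumed strictly positive)
    predecessors[j] : activities that must finish before j can start
    demands[j][k]   : units of resource k used by j while in progress
    capacities[k]   : units of resource k available at any time
    priority        : list of all activities, most urgent first
    Returns a dict mapping each activity to its start time.
    """
    n_res = len(capacities)
    start, finish = {}, {}
    t = 0
    while len(start) < len(durations):
        # Capacity left free by the activities in progress at time t.
        active = [j for j in start if start[j] <= t < finish[j]]
        free = [capacities[k] - sum(demands[j][k] for j in active)
                for k in range(n_res)]
        for j in priority:
            if j in start:
                continue
            # Eligible only if every predecessor has finished by time t.
            if any(finish.get(i, float("inf")) > t for i in predecessors[j]):
                continue
            if all(demands[j][k] <= free[k] for k in range(n_res)):
                start[j], finish[j] = t, t + durations[j]
                free = [free[k] - demands[j][k] for k in range(n_res)]
        if len(start) < len(durations):
            # Next decision point: earliest completion time after t.
            later = [f for f in finish.values() if f > t]
            if not later:
                raise ValueError("no feasible schedule for the given data")
            t = min(later)
    return start
```

A metaheuristic such as an evolutionary solver would typically vary the priority list and call a routine like this to obtain the makespan of each candidate.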

    Large-scale clustering using mathematical programming

    Cluster analysis is a fundamental task in exploratory data analysis with a wide range of applications. Several clustering approaches based on mathematical programming have been proposed in the literature and were successfully used for small- and medium-scale data sets. However, mathematical programming-based clustering models are rarely used for large-scale data sets due to their extensive running time. In this paper, we propose a general scaling approach for existing mathematical programming-based clustering models that is based on the idea of replacing identical or nearly identical objects by a small set of representatives. Our computational results indicate that the proposed scaling approach substantially reduces running time with a minor loss in clustering accuracy.
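The idea of replacing identical or nearly identical objects by representatives can be sketched as follows. The abstract does not state how the representatives are determined, so the grouping rule used here (rounding each feature to a fixed number of decimals and averaging the members of each group) and the name `build_representatives` are purely hypothetical.

```python
from collections import defaultdict


def build_representatives(objects, precision=1):
    """Collapse nearly identical objects into weighted representatives.

    objects   : list of equal-length tuples of numeric features
    precision : number of decimals used to decide which objects are
                "nearly identical" (hypothetical grouping rule)
    Returns a list of (centroid, weight) pairs, where weight is the
    number of original objects the representative stands for.
    """
    groups = defaultdict(list)
    for obj in objects:
        key = tuple(round(x, precision) for x in obj)
        groups[key].append(obj)
    representatives = []
    for members in groups.values():
        n = len(members)
        centroid = tuple(sum(column) / n for column in zip(*members))
        representatives.append((centroid, n))
    return representatives
```

A clustering model would then be solved on the representatives, with each weight indicating how many original objects a representative replaces.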

    A Continuous-Time Mixed-Binary Linear Programming Formulation for the Multi-Site Resource-Constrained Project Scheduling Problem

    The execution of a project is nowadays often distributed among multiple sites. While some resource units are available at a certain site only, other resource units can be moved across the sites. The problem considered here consists of scheduling a single project's activities, which are interrelated by given precedence relationships of the completion-start type, require various renewable resource types during execution, and can be executed at the different sites of the project, such that the project makespan is minimized; transportation times must be taken into account if a resource unit is moved between two sites, or if two activities interrelated by a precedence relationship are executed at different sites. We present a continuous-time formulation of this problem as a mixed-binary linear program. In an experiment based on a set of 480 instances, we compared the performance of this novel formulation with a discrete-time formulation, which is the only formulation known from the literature; it turned out that when using the novel continuous-time formulation, considerably more instances can be solved to feasibility and to optimality, respectively.